ABSTRACT

Mass clumps in gravitational lens galaxies can perturb lensed images in characteristic
ways. Strong lens flux ratios have been used to constrain the amount of dark matter
substructure in lens galaxies, and various other observables have been considered as
additional probes of substructure. We study the general theory of lensing with stochastic
substructure in order to understand how lensing observables depend on the mass
function and spatial distribution of clumps. We find that magnification perturbations
are mainly sensitive to the total mass in substructure projected near the lensed images;
when the source is small, flux ratios are not very sensitive to the shape of the
clump mass function. Position perturbations are mainly sensitive to a characteristic
clump mass scale, namely $m_{\rm eff} = \langle m^2 \rangle / \langle m \rangle$, with some mild dependence on other
mass moments when the spatial distribution is not uniform. They have contributions
from both "local" and "global" populations of clumps (i.e., those projected near the
images, and those farther away). Time delay perturbations are sensitive to the same
characteristic mass, $m_{\rm eff}$, and mainly driven by the global population of clumps. While
there is significant scatter in all lensing quantities, there are some non-trivial correlations
that may contain further information about the clump population. Our results
indicate that a joint analysis of multiple lens observables will offer qualitatively new
constraints on the mass function and spatial distribution of dark matter substructure
in distant galaxies.



1 INTRODUCTION

The flux ratios in 4-image gravitational lens systems have been used to constrain the amount of dark matter substructure in
lens galaxies (e.g., Mao & Schneider 1998; Metcalf & Madau 2001; Chiba 2002; Dalal & Kochanek 2002). Recently there has



been considerable interest in identifying other types of lensing observations that can provide additional information about the 
population of dark matter clumps. For example, with compact sources (i.e., quasars) the possibilities include: 

• Flux ratios measured at multiple wavelengths corresponding to different source sizes (e.g., Moustakas & Metcalf 2003;
Metcalf et al. 2004; Chiba et al. 2005; Dobler & Keeton 2006).

• Precise image positions (e.g., Koopmans et al. 2002; Chen et al. 2007; MacLeod et al. 2009; Minezaki et al. 2009;
Williams et al. 2008; More et al. 2009).

• High-resolution radio interferometry to resolve the images into multiple "milli-images" (e.g., Gorenstein et al. 1988;
Garrett et al. 1994; Trotter et al. 2000; Ros et al. 2000; Biggs et al. 2004) and search for possible image splittings induced by
substructure (e.g., Yonehara et al. 2003; Zackrisson et al. 2008; Riehm et al. 2009).

• Precise time delays between the images (e.g., Morgan et al. 2006; Keeton & Moustakas 2009; Congdon et al. 2009).

With extended sources it is possible to look for distortions in the shapes and/or surface brightness distributions of the images
(e.g., Metcalf 2002; Inoue & Chiba 2005b,a; Koopmans 2005; Vegetti & Koopmans 2009a).



The high-level goal of such diverse observations is to measure not only the mean density of dark matter substructure
but also the mass function, spatial distribution, and perhaps redshift evolution of the clump population. Those are the
key quantities for testing predictions from the Cold Dark Matter paradigm (recent examples include Madau et al. 2008;
Diemand et al. 2008; Springel et al. 2008) and placing astrophysical constraints on the particle nature of dark matter (e.g.,
Moustakas et al. 2009).



© 0000 RAS 



2 Keeton 



The time is ripe to develop a comprehensive theoretical framework for substructure lensing that consolidates the various
observables and reveals how they depend on physical properties of the clump population. The relevant framework is that of
stochastic lensing, in which we treat the positions and masses of the clumps as random^1 variables and compute statistical
properties of lensing observables. While the application to dark matter substructure is new, there has been considerable formal
work on stochastic lensing in the context of stellar microlensing. Deguchi & Watson (1987, 1988) computed the variance
of brightness fluctuations. Nityananda & Ostriker (1984) and Schneider (1987b) derived the probability distribution for the
lensing shear, which Schneider (1987a) used to find the probability distribution for the magnification. Several studies considered
the probability distribution for lensing deflections, especially as they relate to the statistics of microlensing light curves
(Katz et al. 1986; Seitz & Schneider 1994; Seitz et al. 1994; Neindorf 2003; Tuntsov & Lewis 2006a,b). Petters et al. (2009,
2008) recently began a rigorous program to derive probability distributions for many different lensing quantities for an arbitrary
number of stars (i.e., not necessarily in the limit of $N \to \infty$).

All of those studies assumed the mass clumps have a probability distribution that is spatially uniform. While that 
approximation is reasonable on the scales that are relevant for stellar microlensing, it may not be appropriate for applications 
related to dark matter substructure. Our goal in this paper is to generalise aspects of the theory of stochastic lensing to 
allow arbitrary spatial distributions of mass clumps. We also consider arbitrary mass functions, which have been explicitly 
examined in only some of the microlensing work (e.g., Katz et al. 1986; Neindorf 2003).

Regardless of the spatial distribution and mass function, the lensing effects from a collection of mass clumps can be 
written as a superposition of effects from individual clumps. Since the resulting sums contain many random terms, it is 
natural to wonder whether we can invoke the Central Limit Theorem to argue that stochastic lensing can be described in 
terms of (multivariate) Gaussian distributions. The answer, unfortunately, is no, because in some of the sums the individual 
terms have divergent variances. Previous studies typically sidestepped this issue by using the characteristic function method 
to compute probability distributions. That method requires direct and inverse Fourier transforms whose calculations are 
challenging but feasible in the context of a uniform spatial distribution. It is not yet clear how easily the characteristic 
function method can be generalised to arbitrary spatial distributions, so in this paper we adopt an approach that is simpler 
but still very instructive. The trouble for the Central Limit Theorem can be ascribed to strong perturbations produced by 
clumps close (in projection) to an image. We treat these clumps explicitly by deriving full probability distributions for the 
most extreme perturbations produced by individual clumps. Once we have isolated the troublemakers in this way, we can 
apply the Central Limit Theorem to some subset of the remaining clumps; we specifically consider clumps that are projected 
far from an image. We develop formal methods to treat these "local" and "long-range" regimes, which allow us (1) to draw 
general conclusions about how substructure lensing depends on the mass function and spatial distribution of clumps, and (2) 
to obtain analytic results that will serve as useful limits for an eventual complete theory of substructure lensing. 

Let us be clear from the outset that we make certain simplifying assumptions in the analysis presented here. As in all 
previous formal work on stochastic lensing, we approximate the clumps as point masses, and we assume the clumps are 
independent and identically distributed. We gear the discussion toward applications that involve compact (essentially point- 
like) sources. We compute probability distributions for lensing quantities at fixed points in the image plane (rather than for a 
fixed source position), because this is conceptually straightforward and also relevant for interpreting observed lenses in which 
the image positions are known. We address the utility and validity of these assumptions as they arise in the analysis, and 
mention possible extensions to the current work at the end of the paper. 



2 FUNDAMENTALS 



2.1 Basic theory 

All key lensing quantities can be derived from the lensing potential, $\phi(x)$, which is a scaled version of the two-dimensional
Newtonian gravitational potential and is given by the Poisson equation $\nabla^2 \phi = 2\kappa$. (See Schneider et al. 1993, Petters et al.
2001, and Kochanek 2004 for general reviews of strong lens theory.) Here $\kappa = \Sigma/\Sigma_{\rm crit}$ is the projected surface mass density
in units of the critical density for lensing, $\Sigma_{\rm crit} = (c^2 D_s)/(4\pi G D_l D_{ls})$, where $D_l$, $D_s$, and $D_{ls}$ are angular diameter distances
to the lens, to the source, and from the lens to the source, respectively. The lensing potential can be used to determine the
time delay, defined to be the excess travel time compared with a light ray that travels directly from the source to the observer:

$$\tau(x; u) = \frac{1 + z_l}{c}\,\frac{D_l D_s}{D_{ls}} \left[ \frac{1}{2} |x - u|^2 - \phi(x) \right] \tag{1}$$

where $x$ and $u$ are the angular positions of the image and source, respectively, and $z_l$ is the redshift of the lens. By Fermat's
principle, images form at stationary points of the time delay surface; the condition $\nabla_x \tau(x; u) = 0$ immediately yields the
familiar lens equation,

$$u = x - \alpha(x) \tag{2}$$

where $\alpha = \nabla\phi$ is the deflection vector. The distortion of a (small) image is governed by the magnification tensor,

^1 The dark matter substructure in a galaxy is not truly random; in principle it is determined by the galaxy's formation history. However,
the formation process is sufficiently complicated, and impossible to reconstruct, that it is fair to treat the substructure as random.



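$$M \equiv \left( \frac{\partial u}{\partial x} \right)^{-1} = \begin{bmatrix} 1 - \phi_{xx} & -\phi_{xy} \\ -\phi_{xy} & 1 - \phi_{yy} \end{bmatrix}^{-1} = \begin{bmatrix} 1 - \kappa - \gamma_c & -\gamma_s \\ -\gamma_s & 1 - \kappa + \gamma_c \end{bmatrix}^{-1} \tag{3}$$

where subscripts on $\phi$ denote partial derivatives, e.g., $\phi_{xy} = \partial^2\phi/\partial x\,\partial y$. In the second expression we have identified $\kappa =
(\phi_{xx} + \phi_{yy})/2$ from the Poisson equation; this quantity is referred to as convergence because it describes the focusing of light
rays. We have also defined

$$\gamma_c = \frac{1}{2} \left( \phi_{xx} - \phi_{yy} \right) \quad \mbox{and} \quad \gamma_s = \phi_{xy} \tag{4}$$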



These quantities are referred to as shear because they describe how an image is stretched by lensing. 



2.2 Smooth and lumpy components 



The potential ($\phi$), deflection components ($\alpha_x$ and $\alpha_y$), convergence ($\kappa$), and shear components ($\gamma_c$ and $\gamma_s$) are all linear
quantities, meaning that each one can be written as a sum of contributions from different components of the mass model. In
substructure lensing we write the full lens potential as

$$\phi = \phi^0 + \phi^s \tag{5}$$

The term $\phi^0$ represents the contribution from a smooth component that contains the majority of the mass of the galaxy.^2 The
term $\phi^s$ represents the potential from the substructure (which is itself a sum of contributions from many individual clumps;
see §2.3). Since the deflection components, convergence, and shear components are linear in $\phi$, they can also be written as
sums of smooth and lumpy terms: for example, we can write the full deflection vector as $\alpha = \alpha^0 + \alpha^s$, and likewise for the
convergence and shear.

Any lensing observable that is based on image positions or time delays formally involves the time delay and lens equations
(eqs. 1 and 2), which in turn depend on the substructure potential and deflection. Thus, for position and time delay observables
it is sufficient to focus the theory on understanding probability distributions for $\phi^s$ and $\alpha^s$.

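Lensing observables that involve the shapes and brightnesses of images, by contrast, involve the magnification tensor
and the scalar magnification, which are nonlinear. To discuss the magnification, it is convenient to define second derivative
matrices:

$$\Gamma^0 = \begin{bmatrix} \phi^0_{xx} & \phi^0_{xy} \\ \phi^0_{xy} & \phi^0_{yy} \end{bmatrix} = \begin{bmatrix} \kappa^0 + \gamma^0_c & \gamma^0_s \\ \gamma^0_s & \kappa^0 - \gamma^0_c \end{bmatrix} \quad \mbox{and} \quad \Gamma^s = \begin{bmatrix} \phi^s_{xx} & \phi^s_{xy} \\ \phi^s_{xy} & \phi^s_{yy} \end{bmatrix} = \begin{bmatrix} \kappa^s + \gamma^s_c & \gamma^s_s \\ \gamma^s_s & \kappa^s - \gamma^s_c \end{bmatrix} \tag{6}$$

The magnification tensor associated with the smooth component alone is (cf. eq. 3)

$$M^0 = (I - \Gamma^0)^{-1} \tag{7}$$

where $I$ is the 2×2 identity matrix. The scalar magnification associated with the smooth component is

$$\mu^0 = \det M^0 = \frac{1}{(1 - \kappa^0)^2 - (\gamma^0_c)^2 - (\gamma^0_s)^2} \tag{8}$$

The magnification tensor for the full model is

$$M = (I - \Gamma^0 - \Gamma^s)^{-1} = \left\{ (I - \Gamma^0) \left[ I - (I - \Gamma^0)^{-1} \Gamma^s \right] \right\}^{-1} = (I - M^0 \Gamma^s)^{-1} M^0 \tag{9}$$

Here we have used matrix identities to simplify the expression, and recognised the factor $M^0 = (I - \Gamma^0)^{-1}$. The simplified
expression is useful when we write the scalar magnification associated with the full model:

$$\mu = \det M = \frac{\det M^0}{\det(I - M^0 \Gamma^s)} = \frac{\mu^0}{\det(I - M^0 \Gamma^s)} \tag{10}$$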
It is interesting now to consider the case when the convergence and shear from the substructure are small: $\kappa^s$, $\gamma^s_c$, and $\gamma^s_s$ are
all $O(\epsilon)$ where $\epsilon \ll 1$. While this may not be true in general,^3 the perturbative analysis can still offer useful insight. If we
define the magnification perturbation relative to the smooth model, $\delta\mu = \mu - \mu^0$, we can write the fractional magnification
perturbation as

$$\frac{\delta\mu}{\mu^0} = \frac{1}{\det(I - M^0 \Gamma^s)} - 1 = {\rm tr}(M^0 \Gamma^s) + \left[ {\rm tr}(M^0 \Gamma^s)^2 - \det(M^0 \Gamma^s) \right] + O(\epsilon^3) \tag{11}$$



^2 The superscript 0 here denotes the smooth component and should not be confused with an exponent.

^3 Indeed, flux ratio anomalies indicate that substructure can have order unity effects on magnifications; B2045+265 provides a good
example (Fassnacht et al. 1999; Keeton et al. 2003).

The final expression represents a Taylor series expansion in $\epsilon$, using the fact that each component of $\Gamma^s$ is $O(\epsilon)$ so ${\rm tr}(M^0 \Gamma^s) \sim
O(\epsilon)$, while $\det(M^0 \Gamma^s) \sim O(\epsilon^2)$ since the matrices are 2×2. (The series expansion would have a different form if the
matrices had a different dimensionality.) In the case of point mass clumps, $\kappa^s = 0$ so we can write the fractional magnification
perturbation as

$$\frac{\delta\mu}{\mu^0} = 2\mu^0 \left( \gamma^0_c \gamma^s_c + \gamma^0_s \gamma^s_s \right) + \mu^0 \left[ (\gamma^s_c)^2 + (\gamma^s_s)^2 + 4\mu^0 \left( \gamma^0_c \gamma^s_c + \gamma^0_s \gamma^s_s \right)^2 \right] + O(\epsilon^3) \tag{12}$$

We can take one more step by defining pseudo-polar coordinates for the shear:

$$\gamma^0_c = \gamma^0 \cos 2\theta^0_\gamma \quad \mbox{and} \quad \gamma^0_s = \gamma^0 \sin 2\theta^0_\gamma \tag{13}$$

Here $\gamma^0$ and $\theta^0_\gamma$ represent the amplitude and direction of the shear from the smooth component; the factor of 2 enters the trig
functions because shear is symmetric through a 180° rotation (and this is why we call these "pseudo-polar" coordinates). We
can likewise define $\gamma^s$ and $\theta^s_\gamma$ to be the amplitude and direction of the shear from substructure. We can then use the identity

$$\gamma^0_c \gamma^s_c + \gamma^0_s \gamma^s_s = \gamma^0 \gamma^s \cos 2(\theta^0_\gamma - \theta^s_\gamma) \tag{14}$$

to write the fractional magnification perturbation as

$$\frac{\delta\mu}{\mu^0} = 2\mu^0 \gamma^0 \gamma^s \cos 2(\theta^0_\gamma - \theta^s_\gamma) + \mu^0 (\gamma^s)^2 \left[ 1 + 4\mu^0 (\gamma^0)^2 \cos^2 2(\theta^0_\gamma - \theta^s_\gamma) \right] + O(\epsilon^3) \tag{15}$$

There are several interesting conclusions to draw from this analysis. First, even the fractional magnification perturbation is
proportional to $\mu^0$, and one of the second-order terms is actually quadratic in $\mu^0$. A given substructure shear will therefore
cause a stronger flux ratio anomaly in a more magnified image. The factor of $\mu^0$ means the sign of the fractional magnification
perturbation depends on the parity of the image (at lowest order). The sign is also affected, though, by the $\cos 2(\theta^0_\gamma - \theta^s_\gamma)$
factor. To the extent that the direction of the substructure shear is correlated with the direction of the smooth shear, the
cosine factor will be more likely to have a positive sign, and so positive-parity images ($\mu^0 > 0$) will tend to be brightened by
substructure ($\delta\mu/\mu^0 > 0$) whereas negative-parity images ($\mu^0 < 0$) will tend to be dimmed ($\delta\mu/\mu^0 < 0$). If the directions
of the smooth and substructure shears are uncorrelated, however, the fractional magnification perturbation will be equally
likely to have either sign and so the parity dependence will be masked. Another consequence of the cosine factor is that the
fractional magnification perturbation is only sensitive to the component of the substructure shear that is "aligned" with the
smooth shear (for shear, "aligned" means either parallel or perpendicular). The "cross" component of the substructure shear
(i.e., the component oriented at 45° with respect to the smooth shear) does not affect the magnification perturbation. All of
these conclusions apply only at lowest order in the substructure shear, so they are not fully general, but they are interesting
nonetheless.
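The expansion can be checked numerically. The following is a minimal sketch, assuming a hypothetical smooth model (the values of `KAPPA0`, `GC0`, `GS0` and the shear amplitude `EPS` are illustrative and not taken from the paper). It compares the exact fractional perturbation, $1/\det(I - M^0\Gamma^s) - 1$, with the first- and second-order terms of eq. (12) for a substructure shear that is aligned with the smooth shear and for one that is purely "cross".

```python
import numpy as np

def smooth_magnification_tensor(kappa0, gc0, gs0):
    """M^0 = (I - Gamma^0)^{-1} for the smooth component (eqs. 6 and 7)."""
    gamma0 = np.array([[kappa0 + gc0, gs0],
                       [gs0, kappa0 - gc0]])
    return np.linalg.inv(np.eye(2) - gamma0)

def exact_perturbation(kappa0, gc0, gs0, gc, gs):
    """Exact delta mu / mu^0 for point-mass substructure (kappa^s = 0)."""
    m0 = smooth_magnification_tensor(kappa0, gc0, gs0)
    gamma_s = np.array([[gc, gs], [gs, -gc]])
    return 1.0 / np.linalg.det(np.eye(2) - m0 @ gamma_s) - 1.0

def second_order_perturbation(kappa0, gc0, gs0, gc, gs):
    """Return (first-order term, first + second order) from eq. (12)."""
    mu0 = 1.0 / ((1.0 - kappa0) ** 2 - gc0 ** 2 - gs0 ** 2)
    dot = gc0 * gc + gs0 * gs
    first = 2.0 * mu0 * dot
    second = mu0 * (gc ** 2 + gs ** 2 + 4.0 * mu0 * dot ** 2)
    return first, first + second

# Illustrative (assumed) smooth model and substructure shear amplitude.
KAPPA0, GC0, GS0 = 0.45, 0.30, 0.10
GAMMA0 = np.hypot(GC0, GS0)
EPS = 0.01
aligned = (EPS * GC0 / GAMMA0, EPS * GS0 / GAMMA0)   # parallel to smooth shear
cross = (-EPS * GS0 / GAMMA0, EPS * GC0 / GAMMA0)    # rotated by 45 degrees

for label, (gc, gs) in (("aligned", aligned), ("cross", cross)):
    exact = exact_perturbation(KAPPA0, GC0, GS0, gc, gs)
    first, second = second_order_perturbation(KAPPA0, GC0, GS0, gc, gs)
    print(f"{label:8s} exact={exact:+.6f} first={first:+.6f} second={second:+.6f}")
```

For the cross configuration the first-order term vanishes, so only the quadratic term $\mu^0 (\gamma^s)^2$ survives, in line with the statement that the cross component does not affect the perturbation at lowest order.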

Even for arbitrarily large substructure effects, eq. (11) shows that the magnification perturbation can be written in terms
of the convergence and shear from substructure. More generally, we see from the discussion in this subsection that all of the
key lensing observables can be written in terms of the substructure quantities $(\phi^s, \alpha^s_x, \alpha^s_y, \kappa^s, \gamma^s_c, \gamma^s_s)$; thus, if we can
determine the probability density for these quantities (actually, their joint probability density at all image positions) we will have
all the information we need to describe lensing observables. Because of the linearity, this probability density does not depend on the smooth
component, which means that we do not need to discuss the smooth component further. In the remainder of this paper,
$(\phi, \alpha_x, \alpha_y, \kappa, \gamma_c, \gamma_s)$ always refer to substructure quantities, and we drop the superscript "s" to simplify the notation.



2.3 Complex notation 



Lens theory naturally operates in two dimensions, and it is customary to treat positions and deflections as real vectors in a 
two-dimensional plane. However, it is also possible to think of that plane as the complex plane by combining coordinates x 
and y into the complex number 



$$w = x + iy \tag{16}$$

We can also define the complex deflection and shear,

$$\tilde\alpha = \alpha_x + i\alpha_y \quad \mbox{and} \quad \tilde\gamma = \gamma_c + i\gamma_s \tag{17}$$

Note that we use tildes to identify these as complex quantities which are distinct from their scalar amplitudes, $\alpha = |\tilde\alpha|$ and $\gamma =
|\tilde\gamma|$. In various circumstances the complex notation can simplify lens theory (e.g., Bourassa et al. 1973; Bourassa & Kantowski
1975; Witt 1990; Rhie 1997; Petters et al. 2001; An 2005, 2007; An & Evans 2006; Khavinson & Neumann 2006; Fassnacht et al.
2007), and we shall find it to be useful. In complex variables, the potential, deflection, shear, and convergence at position $w$
created by a point mass clump $m_i$ at position $w_i$ have the form:

$$\phi_i(w) = \frac{m_i}{\pi} \ln \frac{|w - w_i|}{a} \tag{18}$$

$$\tilde\alpha_i(w) = \frac{m_i}{\pi} \frac{1}{w^* - w_i^*} \tag{19}$$

$$\tilde\gamma_i(w) = -\frac{m_i}{\pi} \frac{1}{(w^* - w_i^*)^2} \tag{20}$$

$$\kappa_i(w) = m_i\, \delta_D(w - w_i) \tag{21}$$

where the asterisk denotes complex conjugation, and $\delta_D$ is the Dirac delta function. Also, $m_i = M_i/\Sigma_{\rm crit}$ is the physical mass
of the clump scaled by the critical surface density for lensing, so it is a quantity with dimensions of area. In the potential, $a$
is an arbitrary length scale that can be used to set the zeropoint of the potential. We discuss the zeropoint as needed below.
The total substructure potential at position w due to all clumps can then be written as 

$$\phi(w) = \sum_i \phi_i(w) \tag{22}$$

and there are similar expressions for the total deflection, shear, and convergence. 
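The complex notation maps directly onto complex arithmetic. The sketch below is an illustration only: the function names, the clump positions and the masses are hypothetical, and the point-mass forms used are $\phi_i = (m_i/\pi)\ln(|w - w_i|/a)$, $\tilde\alpha_i = (m_i/\pi)/(w^* - w_i^*)$ and $\tilde\gamma_i = -(m_i/\pi)/(w^* - w_i^*)^2$, with the shear sign fixed by the definitions $\gamma_c = (\phi_{xx} - \phi_{yy})/2$ and $\gamma_s = \phi_{xy}$.

```python
import numpy as np

def potential(w, w_i, m_i, a=1.0):
    """Point-mass potential; `a` sets the (arbitrary) zeropoint."""
    return (m_i / np.pi) * np.log(np.abs(w - w_i) / a)

def deflection(w, w_i, m_i):
    """Complex deflection alpha_x + i alpha_y of a point mass."""
    return (m_i / np.pi) / np.conj(w - w_i)

def shear(w, w_i, m_i):
    """Complex shear gamma_c + i gamma_s of a point mass."""
    return -(m_i / np.pi) / np.conj(w - w_i) ** 2

def total(field, w, clump_w, clump_m, **kwargs):
    """Sum a single-clump quantity over all clumps (linearity)."""
    return sum(field(w, wi, mi, **kwargs) for wi, mi in zip(clump_w, clump_m))

# Hypothetical clump population: positions (complex) and scaled masses (areas).
clump_w = [1.0 + 1.0j, -2.0 + 0.5j]
clump_m = [0.1, 0.3]
w_img = 0.2 - 0.3j

print("potential :", total(potential, w_img, clump_w, clump_m))
print("deflection:", total(deflection, w_img, clump_w, clump_m))
print("shear     :", total(shear, w_img, clump_w, clump_m))
```

A finite-difference gradient of the summed potential reproduces the summed deflection, which is a quick way to check the sign conventions.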

At times it is useful to convert from the complex notation into conventional polar coordinates. If we write $w = r\,e^{i\theta}$ then
$r$ and $\theta$ have their usual relation to $x$ and $y$. We also consider polar coordinates centred on the position of an image by writing
$w' = w - w_{\rm img} = r' e^{i\theta'}$. If the image itself has polar coordinates $(r_{\rm img}, \theta_{\rm img})$, the two polar coordinate systems are related by

$$r^2 = r_{\rm img}^2 + r'^2 + 2 r_{\rm img} r' \cos(\theta' - \theta_{\rm img}) \quad \mbox{and} \quad r'^2 = r_{\rm img}^2 + r^2 - 2 r_{\rm img} r \cos(\theta - \theta_{\rm img}) \tag{23}$$

2.4 Clump population

In this paper we use point mass clumps, so each clump has a position, $w_i$, and scaled mass, $m_i$. The point mass approximation
is valid for any spherical clump that does not overlap the line of sight, because the gravitational field outside such a clump is
equivalent to that of a point mass clump. There will be some correction for clumps that overlap the line of sight, but we
leave for future work the development of a general theory that includes spatially extended clumps. (See Rozo et al. 2006 for
an analysis of magnification perturbations that can handle extended clumps.)

The statistical properties of the clump population are specified by the joint probability density $p(w_1, m_1, w_2, m_2, \ldots)$. We
assume the clumps are independent and identically distributed, so the joint probability density can be written as a product
of probability densities for individual clumps:

$$p(w_1, m_1, w_2, m_2, \ldots) = p(w_1, m_1) \times p(w_2, m_2) \times \ldots \tag{24}$$

This approach neglects possible correlations among clumps, but even if such correlations exist in three dimensions they will
be suppressed to some extent by projection effects in lensing. Correlations among clumps would further complicate the theory
and so we defer them to future work. We also assume the positions and masses are separable, so we can write

$$p(w_i, m_i) = p_w(w_i)\, p_m(m_i) \tag{25}$$

Here $p_w(w)$ is the spatial probability density, defined such that $p_w(w)\,dw$ is the probability that a given clump is found within
some area element $dw$ around position $w$ (in the complex plane). Also, $p_m(m)$ is the mass probability density such that
$p_m(m)\,dm$ is the probability that a given clump has mass between $m$ and $m + dm$; the equivalent clump mass function is
$dN/dm = N p_m(m)$. Numerical simulations suggest that mass and position may not be fully separable; clumps in different
mass ranges tend to have somewhat different spatial distributions due to effects such as tidal stripping and disruption (e.g.,
Ghigna et al. 2000; De Lucia et al. 2004; Nagai & Kravtsov 2005). Such an effect could be accounted for in our formalism by
dividing the clump population into several sub-populations by mass and using a different spatial distribution for each.

To gain a more intuitive understanding of the spatial probability density, consider the mean surface mass density in
substructure:

$$\bar\kappa_s(w) = \langle \kappa(w) \rangle = N \langle m \rangle\, p_w(w) \tag{26}$$

Here "mean" refers to averaging over many realizations of the substructure, in a sense that is formalised in Appendix A (see
eq. A1 with $f_i(w) = m_i\, \delta_D(w - w_i)$ from eq. 21). Later in the paper we use this relation to replace the abstract quantity
$p_w(w)$ with the more physical quantity $\bar\kappa_s(w)$ in spatial integrals.
2.5 Sample models 

Even as we attempt to build a general theory of lensing with stochastic substructure, there are times when it is helpful to 
consider specific sample clump populations. For the spatial distribution, one simple model we consider is a uniform distribution, 

$$p_w(w) = \frac{\bar\kappa_s}{N \langle m \rangle} = \mbox{constant} \tag{27}$$

defined over some large but finite area such that $\int p_w(w)\,dw = 1$. The total number of clumps $N$ is basically interchangeable
with the area over which $p_w$ is defined. As discussed in the Introduction, the spatially uniform case allows us to connect with
previous work on stochastic lensing. A second spatial model we consider is a power law distribution,

$$p_w(w) \propto |w|^{\eta - 2} \quad \mbox{with} \quad 0 < \eta < 2 \tag{28}$$

The power law index is chosen such that the total mass in substructure within projected radius $r$ scales as $M(r) \propto r^\eta$. In
terms of image-centred polar coordinates we can write

$$\bar\kappa_s(r', \theta') = \bar\kappa_{s,{\rm img}} \left[ 1 + \left( \frac{r'}{r_{\rm img}} \right)^2 + 2\, \frac{r'}{r_{\rm img}} \cos(\theta' - \theta_{\rm img}) \right]^{(\eta - 2)/2} \tag{29}$$

While the uniform and power law models do not necessarily match simulated CDM substructure populations in detail, they
are useful pedagogical examples and sufficient for our purposes in this paper.

Table 1. Sample clump mass functions. All have power law index $\beta = -1.9$ and are normalised to have the same effective mass,
$m_{\rm eff} = \langle m^2 \rangle / \langle m \rangle$. Column 1 gives the dynamic range, $q = m_2/m_1$. Column 2 gives the lower end of the mass range. Column 3 gives
the mean mass. The remaining columns give higher moments (taking appropriate powers so each quantity has dimensions of mass). All
masses are given in units of 10** $M_\odot$.

    q       M_1      <M>     <M^2>^(1/2)   <M^3>^(1/3)   <M^4>^(1/4)
    1       4.58     4.58    4.58          4.58          4.58
    3       2.49     4.14    4.35          4.57          4.77
    10      1.12     3.00    3.70          4.43          5.08
    30      0.496    1.90    2.95          4.10          5.14
    100     0.187    1.00    2.14          3.56          4.89
    300     0.0731   0.509   1.53          2.99          4.47
    1000    0.0251   0.225   1.02          2.39          3.90

CDM simulations predict that the mass function is a power law over many orders of magnitude, $dN/dm \propto m^\beta$ with
$\beta \approx -1.9$ (e.g., Madau et al. 2008; Springel et al. 2008). To understand how substructure lensing depends on clump mass, it
is instructive to consider a finite mass range $m_1 \le m \le m_2$. The mean clump mass is then

$$\langle m \rangle = m_1\, \frac{1 + \beta}{2 + \beta}\, \frac{q^{2+\beta} - 1}{q^{1+\beta} - 1} \tag{30}$$

where $q = m_2/m_1$ is the dynamic range of the mass function. More generally, other moments of the mass function have the
form

$$\langle m^n \rangle = m_1^n\, \frac{1 + \beta}{n + 1 + \beta}\, \frac{q^{n+1+\beta} - 1}{q^{1+\beta} - 1} \tag{31}$$

As we shall see, there is one combination of mass moments that arises repeatedly:

$$m_{\rm eff} = \frac{\langle m^2 \rangle}{\langle m \rangle} = m_1\, \frac{2 + \beta}{3 + \beta}\, \frac{q^{3+\beta} - 1}{q^{2+\beta} - 1} \tag{32}$$

This quantity has dimensions of mass, and has been referred to in microlensing as the "effective mass" (e.g., Refsdal & Stabell
1991; Neindorf 2003).



We define a fiducial model whose spatial distribution is a power law with $\eta = 1$ (corresponding to an isothermal profile),
and whose mass function has mean mass $\langle M \rangle$ = 10* $M_\odot$ and dynamic range $q = 100$. We then construct variants with a
steeper or shallower radial profile ($\eta = 0.5$ or 1.5, respectively); all models are normalised to have $\bar\kappa_s = 0.01$ in the vicinity
of the lensed images. We also construct other mass functions that have different dynamic ranges but are all normalised to
have the same effective mass (see Table 1). Note that the quantitative details of our sample models are not important; what
matters is that the examples illustrate the key concepts and scalings derived from our mathematical analysis.
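The moments of a truncated power law mass function are simple to evaluate. The sketch below is a minimal illustration (the function names are hypothetical): it works in units of the fiducial mean mass, fixes the effective mass from the fiducial $q = 100$ model, and then solves for the lower mass limit of each other dynamic range so that $m_{\rm eff}$ is unchanged. It assumes $\langle m^n \rangle = m_1^n \frac{1+\beta}{n+1+\beta} \frac{q^{n+1+\beta} - 1}{q^{1+\beta} - 1}$ and does not handle the logarithmic special cases where $1 + \beta$ or $n + 1 + \beta$ vanishes.

```python
def moment(n, m1, q, beta=-1.9):
    """<m^n> for dN/dm proportional to m^beta on m1 <= m <= q*m1."""
    if q == 1:
        return m1 ** n  # all clumps have the same mass
    return (m1 ** n * (1 + beta) / (n + 1 + beta)
            * (q ** (n + 1 + beta) - 1) / (q ** (1 + beta) - 1))

def effective_mass(m1, q, beta=-1.9):
    """m_eff = <m^2> / <m>."""
    return moment(2, m1, q, beta) / moment(1, m1, q, beta)

# Fiducial model: q = 100 with mean mass 1 in the adopted mass unit.
M1_FID = 1.0 / moment(1, 1.0, 100)
M_EFF = effective_mass(M1_FID, 100)

def table_row(q, beta=-1.9):
    """Return [M_1, <M>, <M^2>^(1/2), <M^3>^(1/3), <M^4>^(1/4)] at fixed m_eff."""
    m1 = M_EFF / effective_mass(1.0, q, beta)
    return [m1] + [moment(n, m1, q, beta) ** (1.0 / n) for n in (1, 2, 3, 4)]

for q in (1, 3, 10, 30, 100, 300, 1000):
    print(f"{q:5d} " + " ".join(f"{v:7.3g}" for v in table_row(q)))
```

Because every row shares the same $m_{\rm eff}$, the second-moment column shrinks as the mean mass shrinks with increasing $q$, while the product $\langle M^2 \rangle / \langle M \rangle$ stays fixed.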

We consider a cosmology with $\Omega_M = 0.274$ and $\Omega_\Lambda = 0.726$ (Komatsu et al. 2009). We set the lens and source redshifts
to $z_l = 0.31$ and $z_s = 1.722$ (as for PG 1115+080; Weymann et al. 1980; Henry & Heasley 1986). Then the cosmological
distances are $D_l = 662\,h^{-1}$ Mpc, $D_s = 1248\,h^{-1}$ Mpc, and $D_{ls} = 930\,h^{-1}$ Mpc, and the critical density for lensing is $\Sigma_{\rm crit} =
3378\,h\,M_\odot\,{\rm pc}^{-2}$ or equivalently $\Sigma_{\rm crit} = 3.48 \times 10^{10}\,h^{-1}\,M_\odot\,{\rm arcsec}^{-2}$.
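The quoted critical density follows from $\Sigma_{\rm crit} = c^2 D_s / (4\pi G D_l D_{ls})$ and the distances above. The sketch below is a rough cross-check, not the calculation used in the paper: the physical constants are standard values supplied here as assumptions, and the distances are the rounded figures quoted in the text, so agreement is expected only at the sub-percent level.

```python
import math

C = 2.99792458e8             # speed of light [m/s]
G_MSUN = 1.32712440018e20    # G times the solar mass [m^3/s^2]
PC = 3.0856775814913673e16   # parsec [m]
ARCSEC = math.pi / (180.0 * 3600.0)  # arcsecond [rad]

def sigma_crit_pc2(d_l_mpc, d_s_mpc, d_ls_mpc):
    """Critical density in h Msun/pc^2 for distances given in h^-1 Mpc."""
    d_l, d_s, d_ls = (d * 1e6 * PC for d in (d_l_mpc, d_s_mpc, d_ls_mpc))
    per_m2 = C ** 2 * d_s / (4.0 * math.pi * G_MSUN * d_l * d_ls)  # Msun/m^2
    return per_m2 * PC ** 2

def sigma_crit_arcsec2(d_l_mpc, d_s_mpc, d_ls_mpc):
    """Critical density in h^-1 Msun/arcsec^2 (angles measured at the lens)."""
    pc_per_arcsec = d_l_mpc * 1e6 * ARCSEC
    return sigma_crit_pc2(d_l_mpc, d_s_mpc, d_ls_mpc) * pc_per_arcsec ** 2

D_L, D_S, D_LS = 662.0, 1248.0, 930.0   # h^-1 Mpc, as quoted in the text
print(f"Sigma_crit = {sigma_crit_pc2(D_L, D_S, D_LS):.0f} h Msun/pc^2")
print(f"Sigma_crit = {sigma_crit_arcsec2(D_L, D_S, D_LS):.3e} h^-1 Msun/arcsec^2")
```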



3 LOCAL ANALYSIS 

In the Introduction we noted that the probability distributions for the substructure shear and deflection have divergent
variances. In this section we analyze the "heavy tails" that cause the variance to diverge, by computing the probability
distributions for the most extreme shear, deflection, and potential. (We do not consider the convergence because it is zero for
point mass clumps; see eq. 21.) We seek to understand how the mass function and spatial distribution of the clump population
affect the strongest substructure perturbations. For the analysis in this section we focus on one particular image and compute
one-point statistics.^4 We work in coordinates centred on that image, $w'_i = w_i - w_{\rm img}$, which allows us to write the amplitudes
of the shear and deflection from clump $i$ as

$$\gamma_i = |\tilde\gamma_i| = \frac{m_i}{\pi r_i'^2} \quad \mbox{and} \quad \alpha_i = |\tilde\alpha_i| = \frac{m_i}{\pi r'_i} \tag{33}$$

and the potential from clump $i$ as

$$\phi_i = \frac{m_i}{\pi} \ln \frac{r'_i}{a} \tag{34}$$

where again $a$ is a length scale that sets the zeropoint of the potential.



3.1 Uniform spatial distribution 

We begin with the simple case of a uniform spatial distribution to introduce the methods. From eq. (27), the spatial probability
distribution is $p_w(w) = \bar\kappa_s / (N \langle m \rangle)$. When $N$ is finite the area over which $p_w(w)$ is defined is also finite (albeit large), but we
will not be too concerned with the boundary because we will eventually take the limit $N \to \infty$.

First consider the shear. We seek to determine the probability distribution for the largest shear produced by any individual
clump. Before we can do that, we need to find the probability that the shear from a given clump $i$ is bigger than $\gamma$:

$$P_i(>\gamma) = \int_{\gamma_i > \gamma} p_w(w'_i)\, p_m(m_i)\, dw'_i\, dm_i = \frac{\bar\kappa_s}{N \langle m \rangle} \int dm\, p_m(m) \int_0^{2\pi} d\theta' \int_0^{(m/\pi\gamma)^{1/2}} dr'\, r' = \frac{\bar\kappa_s}{N \gamma} \tag{35}$$

Here we have written the expression for the total probability in the region where $\gamma_i > \gamma$, then plugged in for $p_w(w')$ and
written the $w'$ integral in terms of polar coordinates (using $w' = r' e^{i\theta'}$), and finally evaluated the integrals. Strictly speaking,
this analysis is not valid to arbitrarily small values of $\gamma$, because small shears can only be produced by clumps that are far
from the image, and the clump domain has some finite extent. Put another way, $P_i(>\gamma)$ cannot exceed unity, so clearly this
expression is valid only for $\gamma \ge \bar\kappa_s / N$. This detail will become immaterial when we take the limit $N \to \infty$.

Now, the probability that the shear from clump $i$ is smaller than $\gamma$ is obviously $1 - P_i(>\gamma)$. Since the clumps are
independent, the probability that the shears from all clumps are smaller than $\gamma$ is obtained by taking the product of all the
individual clump probabilities:

$$P_{\rm all}(<\gamma) = \prod_i \left[ 1 - P_i(>\gamma) \right] = \left( 1 - \frac{\bar\kappa_s}{N \gamma} \right)^N \to \exp\left( -\frac{\bar\kappa_s}{\gamma} \right) \quad \mbox{for } N \to \infty \tag{36}$$

This is the cumulative probability distribution for the largest shear amplitude. Notice that the clump mass function dropped
out of the analysis: the probability distribution for the largest shear does not depend on how the clumps are distributed in
mass, when the spatial distribution is uniform (also see Schneider 1987b; Petters et al. 2008).
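A small Monte Carlo experiment illustrates the claim that the mass function drops out. The sketch below is an illustration under stated assumptions rather than the paper's own calculation: clumps are scattered uniformly in a disc of area $N \langle m \rangle / \bar\kappa_s$ centred on the image, the sample sizes, seed and mass ranges are arbitrary choices, and the empirical distribution of the largest shear is compared with $\exp(-\bar\kappa_s/\gamma)$ for an equal-mass population and for a power law with $\beta = -1.9$ and $q = 100$.

```python
import numpy as np

def sample_power_law(rng, shape, m1, q, beta=-1.9):
    """Draw masses from p_m(m) proportional to m^beta on [m1, q*m1] (inverse CDF)."""
    k = 1.0 + beta
    u = rng.random(shape)
    return (m1 ** k + u * ((q * m1) ** k - m1 ** k)) ** (1.0 / k)

def mean_power_law(m1, q, beta=-1.9):
    """Analytic mean mass of the truncated power law."""
    return (m1 * (1 + beta) / (2 + beta)
            * (q ** (2 + beta) - 1) / (q ** (1 + beta) - 1))

def largest_shear(rng, masses, mean_mass, kappa_s):
    """Largest single-clump shear in each realization (one row per realization)."""
    n_clumps = masses.shape[1]
    u = 1.0 - rng.random(masses.shape)   # (r'/R)^2 for a uniform disc, in (0, 1]
    # gamma_i = m_i / (pi r'^2), with disc area pi R^2 = N <m> / kappa_s
    gamma = kappa_s * masses / (n_clumps * mean_mass * u)
    return gamma.max(axis=1)

rng = np.random.default_rng(12345)
KAPPA_S, N_REAL, N_CLUMPS = 0.01, 2000, 1000

equal = np.ones((N_REAL, N_CLUMPS))
power = sample_power_law(rng, (N_REAL, N_CLUMPS), m1=1.0, q=100.0)
max_equal = largest_shear(rng, equal, 1.0, KAPPA_S)
max_power = largest_shear(rng, power, mean_power_law(1.0, 100.0), KAPPA_S)

for g in (0.005, 0.01, 0.03):
    print(g, np.mean(max_equal < g), np.mean(max_power < g), np.exp(-KAPPA_S / g))
```

The two empirical columns should agree with each other and with the analytic curve to within sampling noise, which is of order one percent for 2000 realizations.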



Next we consider the probability distribution for the largest deflection amplitude. The argument proceeds just as before.
The probability that the deflection from a given clump $i$ is bigger than $\alpha$ is:

$$P_i(>\alpha) = \int_{\alpha_i > \alpha} p_w(w'_i)\, p_m(m_i)\, dw'_i\, dm_i = \frac{\bar\kappa_s}{N \langle m \rangle} \int dm\, p_m(m) \int_0^{2\pi} d\theta' \int_0^{m/\pi\alpha} dr'\, r' = \frac{\bar\kappa_s}{N \pi \alpha^2 \langle m \rangle} \int dm\, p_m(m)\, m^2 = \frac{\bar\kappa_s \langle m^2 \rangle}{N \pi \alpha^2 \langle m \rangle} \tag{37}$$

The probability that all deflections are smaller than $\alpha$ is then:

$$P_{\rm all}(<\alpha) = \exp\left( -\frac{\bar\kappa_s \langle m^2 \rangle}{\pi \alpha^2 \langle m \rangle} \right) = \exp\left( -\frac{\bar\kappa_s m_{\rm eff}}{\pi \alpha^2} \right) \quad \mbox{for } N \to \infty \tag{38}$$

This is the cumulative probability distribution for the largest deflection amplitude. Here we see that the probability distribution
for the largest deflection does depend on the clump mass function, but only through the combination of moments known as
effective mass, $m_{\rm eff} = \langle m^2 \rangle / \langle m \rangle$. This is consistent with previous results showing that the deflection probability distribution
depends on the mass function through $m_{\rm eff}$ when the spatial distribution is uniform (Katz et al. 1986; Neindorf 2003).



Finally we turn to the potential. The only difference in this case is that the potential increases with the distance of the
clump from the image, so the inequalities are reversed. Specifically, we want to determine the probability distribution for the
"strongest" (i.e., most negative) potential. The probability that the potential from a given clump $i$ is less than $\phi$ is:

$$P_i(<\phi) = \int_{\phi_i < \phi} p_w(w'_i)\, p_m(m_i)\, dw'_i\, dm_i = \frac{\bar\kappa_s}{N \langle m \rangle} \int dm\, p_m(m) \int_0^{2\pi} d\theta' \int_0^{a e^{\pi\phi/m}} dr'\, r' = \frac{\pi a^2 \bar\kappa_s}{N \langle m \rangle} \int e^{2\pi\phi/m}\, p_m(m)\, dm \tag{39}$$

The probability that all potentials are larger than $\phi$ is then (for $N \to \infty$)

$$P_{\rm all}(>\phi) = \exp\left[ -\frac{\pi a^2 \bar\kappa_s}{\langle m \rangle} \int e^{2\pi\phi/m}\, p_m(m)\, dm \right] \tag{40}$$

This is the cumulative probability distribution for the strongest potential. In this case we cannot express the mass integral in
terms of any simple moment of the mass function.

^4 In this section we do not consider two-point statistics because we expect the local effects for different images to be (largely) independent:
the clump that produces the largest shear for one image is not likely to be the clump that produces the largest shear for another image.



3.2 General case 

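For an arbitrary spatial distribution, the analogues of eqs. (35), (37), and (39) are

$$P_i(>\gamma) = \int_{\gamma_i > \gamma} p_w(w'_i)\, p_m(m_i)\, dw'_i\, dm_i = \frac{1}{N \langle m \rangle} \int dm\, p_m(m) \int_0^{2\pi} d\theta' \int_0^{(m/\pi\gamma)^{1/2}} dr'\, r'\, \bar\kappa_s(r', \theta') \tag{41}$$

$$P_i(>\alpha) = \int_{\alpha_i > \alpha} p_w(w'_i)\, p_m(m_i)\, dw'_i\, dm_i = \frac{1}{N \langle m \rangle} \int dm\, p_m(m) \int_0^{2\pi} d\theta' \int_0^{m/\pi\alpha} dr'\, r'\, \bar\kappa_s(r', \theta') \tag{42}$$

$$P_i(<\phi) = \int_{\phi_i < \phi} p_w(w'_i)\, p_m(m_i)\, dw'_i\, dm_i = \frac{1}{N \langle m \rangle} \int dm\, p_m(m) \int_0^{2\pi} d\theta' \int_0^{a e^{\pi\phi/m}} dr'\, r'\, \bar\kappa_s(r', \theta') \tag{43}$$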



Notice that the form of the integral is the same in all three cases; the only difference is the upper limit of the radial integral. 
Therefore let us define 

$$Q(z) \equiv \int d\theta' \int_0^z dr'\, r'\, \hat\kappa_s(r',\theta') \qquad (44)$$

where $\hat\kappa_s(r',\theta') = \kappa_s(r',\theta')/\kappa_{s,\rm img}$ is the mean substructure density at the position specified
by $(r',\theta')$ normalised by the value at the image. Normalising by $\kappa_{s,\rm img}$ means that the function $Q(z)$ is
independent of the amount of substructure and depends only on the spatial distribution of the substructure population. Note that
for the uniform case, $Q(z) = \pi z^2$. Now we rewrite eqs. (41)-(43) using $Q(z)$, and then repeat the argument that takes us
from eq. (35) to eq. (36), to obtain general expressions for the probability distributions for the local shear, deflection, and
potential (in the limit $N \to \infty$):



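$$P(<\gamma) = \exp\left[ -\frac{\kappa_{s,\rm img}}{\langle m \rangle} \int Q\!\left( \sqrt{\frac{m}{\pi\gamma}} \right) p_m(m)\, dm \right] \qquad (45)$$

$$P(<\alpha) = \exp\left[ -\frac{\kappa_{s,\rm img}}{\langle m \rangle} \int Q\!\left( \frac{m}{\pi\alpha} \right) p_m(m)\, dm \right] \qquad (46)$$

$$P(>\phi) = \exp\left[ -\frac{\kappa_{s,\rm img}}{\langle m \rangle} \int Q\!\left( a\, e^{\pi\phi/m} \right) p_m(m)\, dm \right] \qquad (47)$$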
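As a concrete illustration of eqs. (45) and (46), the following minimal sketch evaluates them for the uniform case, $Q(z) = \pi z^2$, with a discrete mass function. The function names and the two mass functions are hypothetical choices made only for this example. In this case eq. (45) reduces to $\exp(-\kappa_{s,\rm img}/\gamma)$ for any mass function, while eq. (46) reduces to $\exp[-\kappa_{s,\rm img}\, m_{\rm eff}/(\pi\alpha^2)]$.

```python
import math

def q_uniform(z):
    """Q(z) for a uniform clump distribution: Q(z) = pi * z**2."""
    return math.pi * z * z

def mean_mass(masses, weights):
    """Mean of a discrete mass function (weights must sum to 1)."""
    return sum(p * m for m, p in zip(masses, weights))

def cdf_largest_shear(gamma, masses, weights, kappa_img, q=q_uniform):
    """Eq. (45) evaluated for a discrete mass function."""
    integral = sum(p * q(math.sqrt(m / (math.pi * gamma)))
                   for m, p in zip(masses, weights))
    return math.exp(-kappa_img * integral / mean_mass(masses, weights))

def cdf_largest_deflection(alpha, masses, weights, kappa_img, q=q_uniform):
    """Eq. (46) evaluated for a discrete mass function."""
    integral = sum(p * q(m / (math.pi * alpha))
                   for m, p in zip(masses, weights))
    return math.exp(-kappa_img * integral / mean_mass(masses, weights))

# Hypothetical mass functions, used only for illustration.
single = ([1.0], [1.0])
broad = ([1.0, 10.0], [0.9, 0.1])
for name, (ms, ps) in (("single", single), ("broad", broad)):
    print(name,
          cdf_largest_shear(0.05, ms, ps, kappa_img=0.01),
          cdf_largest_deflection(0.2, ms, ps, kappa_img=0.01))
```

Passing a different `q` (for example a numerical estimate of $Q(z)$ for a non-uniform distribution) reuses the same functions for the general case.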



One conclusion we can draw from these expressions is the general form of the tails of the probability distributions for shear
and deflection. At lowest order in $z$ we have $Q(z) \propto z^2$, which immediately yields

$$p(\gamma) = \frac{dP(<\gamma)}{d\gamma} \propto \gamma^{-2} \quad (\gamma \to \infty) \qquad (48)$$

$$p(\alpha) = \frac{dP(<\alpha)}{d\alpha} \propto \alpha^{-3} \quad (\alpha \to \infty) \qquad (49)$$

(Note that these are one-dimensional probability distributions for the shear and deflection amplitudes, not two-dimensional
distributions for the full shear and deflection.) These scalings have been derived before for a uniform spatial distribution of
equal-mass clumps (Nityananda & Ostriker 1984; Katz et al. 1986; Schneider 1987b; Petters et al. 2009, 2008), but now we
see that they are quite general. This is not surprising: since large shears and deflections can only be produced by clumps in
the vicinity of an image, the tails of the shear and deflection distributions cannot be very sensitive to the global population
of clumps.



3.3 Power law spatial distribution 

Figure 1. Cumulative probability distributions for the "local" (i.e., most extreme) potential, deflection amplitude, and shear amplitude.
In the text we computed the potential distribution as $P(>\phi)$, but here we plot $P(<\phi) = 1 - P(>\phi)$ for consistency with the
deflection and shear plots. The different curves correspond to different mass functions, all normalised to have the same effective mass,
$m_{\rm eff} = \langle m^2 \rangle / \langle m \rangle$ (see Table 1). The spatial distribution of clumps is a power law with $\eta = 1$
and $\kappa_s = 0.01$ at the position of the image we are examining. In the deflection and shear panels, the cyan dot-dash curves show
the theoretical predictions from eqs. (38) and (36). The zeropoint of the potential is chosen so the mean of each potential distribution
is zero.

So far we have examined a uniform spatial distribution, which is simplistic, and the arbitrary case, which yielded expressions
that are fully general but not especially enlightening. We now split the difference by considering a power law spatial distribution,
which is still somewhat simplified but certainly better than the uniform case, and still tractable. From eq. (29) we can
write the spatial factor that appears in $Q(z)$ as

$$\hat\kappa_s(r',\theta') = \left[ 1 + 2\, \frac{r'}{r_{\rm img}} \cos\theta' + \left( \frac{r'}{r_{\rm img}} \right)^2 \right]^{(\eta-2)/2} \qquad (50)$$

where $r_{\rm img}$ is the distance of the image from the centre of the galaxy and $\theta'$ is measured from the radial direction.



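For general $\eta$ we cannot evaluate the integrals in $Q(z)$ analytically. However, we can make a Taylor series expansion in $z$ that
gives a useful approximation in the limit of a large shear or deflection. With $r_{\rm img}$ denoting the distance of the image from
the centre of the galaxy, the series expansion for $Q(z)$ is

$$Q(z) = \pi z^2 \left[ 1 + \frac{(2-\eta)^2}{8} \left( \frac{z}{r_{\rm img}} \right)^2
  + \frac{(2-\eta)^2 (4-\eta)^2}{192} \left( \frac{z}{r_{\rm img}} \right)^4
  + \frac{(2-\eta)^2 (4-\eta)^2 (6-\eta)^2}{9216} \left( \frac{z}{r_{\rm img}} \right)^6 + \ldots \right] \qquad (51)$$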
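The series in eq. (51) can be checked numerically. The sketch below is only an illustration: it assumes a pure power law density with no core, so that the density relative to its value at the image is $[1 + 2x\cos\theta' + x^2]^{(\eta-2)/2}$ with $x = r'/r_{\rm img}$, and it compares a midpoint-rule estimate of $Q(z)$ with the truncated series. The function names and grid sizes are arbitrary choices for this example.

```python
import math

def kappa_hat(r, theta, r_img, eta):
    """Assumed power-law density relative to its value at the image."""
    x = r / r_img
    return (1.0 + 2.0 * x * math.cos(theta) + x * x) ** ((eta - 2.0) / 2.0)

def q_numeric(z, r_img, eta, n_r=200, n_theta=256):
    """Midpoint-rule estimate of Q(z) = int dtheta int_0^z dr r kappa_hat."""
    dr = z / n_r
    dth = 2.0 * math.pi / n_theta
    total = 0.0
    for i in range(n_r):
        r = (i + 0.5) * dr
        ring = sum(kappa_hat(r, (j + 0.5) * dth, r_img, eta)
                   for j in range(n_theta))
        total += r * ring
    return total * dr * dth

def q_series(z, r_img, eta):
    """Series of eq. (51), truncated after the (z / r_img)^6 term."""
    x2 = (z / r_img) ** 2
    a = (2.0 - eta) ** 2
    b = a * (4.0 - eta) ** 2
    c = b * (6.0 - eta) ** 2
    return math.pi * z * z * (1.0 + a / 8.0 * x2
                              + b / 192.0 * x2 ** 2
                              + c / 9216.0 * x2 ** 3)

for eta in (2.0, 1.0, 0.0):
    print(eta, q_numeric(0.3, 1.0, eta), q_series(0.3, 1.0, eta))
```

For $\eta = 2$ both functions return the uniform result $\pi z^2$; for other indices the two estimates agree closely as long as $z$ is well inside $r_{\rm img}$.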



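We use this expression in eqs. (45) and (46), and recognise that the mass integrals yield mass moments of the form
$\int m^n\, p_m(m)\, dm = \langle m^n \rangle$. This yields series expansions for the probability distributions for the strongest
shear and deflection:

$$P(<\gamma) = \exp\left\{ -\frac{\kappa_{s,\rm img}}{\gamma} \left[ 1 + \frac{\langle m^2 \rangle}{\langle m \rangle}\,
  \frac{(\eta-2)^2}{8\pi r_{\rm img}^2\, \gamma} + \ldots \right] \right\} \qquad (52)$$

$$P(<\alpha) = \exp\left\{ -\frac{\kappa_{s,\rm img}}{\pi\alpha^2} \left[ \frac{\langle m^2 \rangle}{\langle m \rangle}
  + \frac{\langle m^4 \rangle}{\langle m \rangle}\, \frac{(\eta-2)^2}{8\pi^2 r_{\rm img}^2\, \alpha^2} + \ldots \right] \right\} \qquad (53)$$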



Notice that to lowest order (i.e., for large local shear and deflection), the shear distribution is independent of the mass
function while the deflection distribution depends on $m_{\rm eff} = \langle m^2 \rangle / \langle m \rangle$, and neither is
sensitive to the spatial distribution of substructure (i.e., to $\eta$). This makes sense physically, because large local shears
and deflections can only be created by clumps relatively close to the image, and so the tail of the probability distribution
depends only on the local abundance of substructure (i.e., $\kappa_{s,\rm img}$). The spatial distribution enters through
higher-order terms, in combination with other moments of the mass function ($\langle m^2 \rangle / \langle m \rangle$ for the
shear, and $\langle m^4 \rangle / \langle m \rangle$ for the deflection). Thus, we find that the shear and deflection
distributions are formally sensitive to the spatial distribution of substructure along with various moments of the mass function.
In practice, however, those sensitivities are important only at relatively low values of the local shear or deflection.

We do not attempt a similar analysis for the potential because it is not possible to express the integral over the mass
function in terms of simple mass moments. The implication is that the probability distribution for the local potential is sensitive
to the full shape of the mass function.



3.4 Examples 

We now use the sample models discussed in §2.5 to illustrate the results from this section. We emphasise that the quantitative
results presented here depend on our choice of sample models and should not be taken as detailed predictions; our goal with
these examples is only to illuminate the concepts drawn from our formal analysis.

Figure 2. Similar to Figure 1, but the mass functions are normalised to have the same mean mass, $\langle m \rangle$ (instead of the
same effective mass, $m_{\rm eff}$).

Figure 3. Similar to Figure 1, but now with different power law indices for the spatial distribution of clumps:
$\kappa_s \propto r^{\eta-2}$. Results are shown for a mass function with dynamic range $q = 100$.



We generate 10* Monte Carlo realizations of clump populations, tabulate the most extreme substructure terms (the
smallest potential, the largest deflection amplitude, and the largest shear amplitude) from each one, and then plot the
resulting probability distributions. For example, Figure 1 shows results from simulations with different clump mass functions,
when all the mass functions are normalised to have the same effective mass, $m_{\rm eff} = \langle m^2 \rangle / \langle m \rangle$.
For comparison, Figure 2 shows the corresponding results when all the mass functions are normalised to have the same mean mass,
$\langle m \rangle$. In both cases the spatial distribution of clumps is a power law with $\eta = 1$ and $\kappa_s = 0.01$ at the
position of the image we are examining.

The direct simulations show that the local shear distribution is essentially identical for all mass functions, which is
consistent with our theoretical expectations; and we see that eq. (36) matches the simulated shear distribution very well.
When we fix $m_{\rm eff}$, the local deflection distribution is basically independent of the clump mass function at large
deflections, and only weakly sensitive to the mass function at small deflections. By contrast, when we fix $\langle m \rangle$ the
deflection distribution is very sensitive to the mass function. Clearly it is the effective mass, and not the mean mass, that is
the important property of the mass function in terms of predicting the local deflection distribution. Furthermore, we see that
eq. (38) gives a very useful approximation for this distribution, except at small deflections (which are probably not of great
interest anyway). There is no obvious, simple scaling for the local potential distribution, although it does appear that the
effective mass is more useful than the mean mass for estimating this distribution.
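The procedure of drawing clump realizations and tabulating the most extreme term can be sketched in a few lines. The example below is a simplified stand-in for the simulations described above, not a reproduction of them: it assumes a uniform distribution of point-mass clumps in a disc centred on the image (rather than the $\eta = 1$ power law), uses the point-mass shear amplitude $m/(\pi r^2)$, and compares the empirical distribution of the largest shear with the uniform-case limit of eq. (45), $P(<\gamma) = \exp(-\kappa_s/\gamma)$. All names, the two mass functions, and the parameter values are hypothetical.

```python
import math
import random

def largest_shear(n_clumps, radius, masses, weights, rng):
    """Largest shear amplitude m / (pi r^2) among clumps uniform in a disc."""
    drawn = rng.choices(masses, weights=weights, k=n_clumps)
    best = 0.0
    for m in drawn:
        u = 1.0 - rng.random()  # u in (0, 1]; r^2 = radius^2 * u is uniform in area
        best = max(best, m / (math.pi * radius * radius * u))
    return best

def simulate(masses, weights, kappa_s=0.05, n_clumps=200, radius=1.0,
             n_real=2000, seed=1):
    """Return the largest shear from each of n_real clump realizations."""
    rng = random.Random(seed)
    mean_raw = sum(p * m for m, p in zip(masses, weights))
    # Rescale the masses so that kappa_s = N <m> / (pi R^2).
    scale = kappa_s * math.pi * radius ** 2 / (n_clumps * mean_raw)
    scaled = [m * scale for m in masses]
    return [largest_shear(n_clumps, radius, scaled, weights, rng)
            for _ in range(n_real)]

def empirical_cdf(samples, x):
    """Fraction of samples below x."""
    return sum(1 for s in samples if s < x) / len(samples)

for label, masses, weights in (("single mass", [1.0], [1.0]),
                               ("broad", [1.0, 10.0], [0.9, 0.1])):
    samples = simulate(masses, weights)
    print(label, empirical_cdf(samples, 0.05), math.exp(-0.05 / 0.05))
```

Both mass functions should give an empirical value near $e^{-1}$ at $\gamma = \kappa_s$, up to sampling noise, which is the mass-function independence of the local shear discussed above.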

Next we consider varying the spatial distribution of mass clumps, as shown in Figure 3. The local shear distribution
is hardly affected except for small changes at small values of the shear. The local deflection distribution is somewhat more
sensitive to the spatial distribution, especially at modest deflections, although eq. (38) remains a useful approximation. Both of
these results are consistent with our theoretical expectations. One striking result we did not predict is that the local potential
distribution is quite insensitive to the spatial distribution of clumps (at least when the potential zeropoint is chosen such that
the mean local potential is zero). We do not currently have a formal explanation for this empirical result.

Figure 4. Comparison of the cumulative probability distributions for the local potential, deflection, and shear (red dotted curves) with
the corresponding distributions for the total potential, deflection, and shear (blue solid curves). Results are shown for a mass function
with dynamic range $q = 100$, and a power law spatial distribution with $\eta = 1$ and $\kappa_s = 0.01$ at the position of the image.
The zeropoint of the potential is again chosen so the mean of each potential distribution is zero.

Finally, one thing we can do with the simulations that we could not do with the theory is determine how important
the local effects are compared with the total effects due to the full clump population. Figure 4 shows cumulative probability
distributions for the local potential, deflection, and shear alongside the corresponding distributions for the total potential,
deflection, and shear. (For the total deflection and shear, we properly sum the complex terms and then take the amplitude of
the final sum.) We use the fiducial model in which the mass function has a dynamic range $q = 100$, and the spatial distribution
is a power law with $\eta = 1$ and $\kappa_s = 0.01$ at the position of the image. The local and total shear distributions are not
very different, especially at large shears, which indicates that the total shear is dominated by local effects (also see
Rozo et al. 2006). The only difference is when the local shear is small or modest, which leaves room for the combined effects of
more distant clumps to become noticeable. By contrast, there is a more significant difference between the local and total deflection
distributions; in other words, there are comparable contributions to the deflection from the clump(s) nearest to the image
and from clumps farther away (also see Chen et al. 2007). As for the potential, the local effects are almost entirely negligible
compared with the combined effects of many distant clumps.
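The distinction between "local" and "total" quantities can be made concrete with a short sketch. Here the local term is the largest single-clump amplitude, while the total is the amplitude of the complex sum over all clumps. The complex point-mass expressions used below, $\alpha = (m/\pi)/(w^* - w_i^*)$ and $\gamma = -(m/\pi)/(w^* - w_i^*)^2$, are conventions assumed for this example (the overall signs do not affect the amplitudes of single terms), and the function names are hypothetical.

```python
import math

def clump_terms(w_img, w_clump, m):
    """Complex deflection and shear at w_img from one point-mass clump.

    The normalisation and sign conventions are assumptions of this sketch.
    """
    d = (w_img - w_clump).conjugate()
    alpha = (m / math.pi) / d
    gamma = -(m / math.pi) / (d * d)
    return alpha, gamma

def local_and_total(w_img, clumps):
    """clumps: list of (complex position, mass) pairs.

    Returns (local deflection, total deflection, local shear, total shear),
    all as amplitudes.
    """
    terms = [clump_terms(w_img, w, m) for w, m in clumps]
    local_alpha = max(abs(a) for a, _ in terms)
    local_gamma = max(abs(g) for _, g in terms)
    total_alpha = abs(sum(a for a, _ in terms))
    total_gamma = abs(sum(g for _, g in terms))
    return local_alpha, total_alpha, local_gamma, total_gamma

# Two equal clumps on opposite sides of the image: the deflections cancel
# in the complex sum, while the shears add.
print(local_and_total(0j, [(0.5 + 0j, 1.0), (-0.5 + 0j, 1.0)]))
```

The symmetric pair shows why the amplitude must be taken after summing: the total deflection vanishes even though each clump individually deflects the image.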

These results represent the first stage of our conclusions about how lensing quantities depend on the mass function and
spatial distribution of clumps. They are placed in a broader context in §5 below.



4 LONG-RANGE ANALYSIS 



We now turn attention to clumps that are far from the images. The potential, deflection, and shear from these clumps can
be written as sums of many contributions, all of which are finite, so the Central Limit Theorem suggests that their joint
probability density can be approximated as a multivariate Gaussian distribution. (Again, we do not consider the convergence
because it is zero for point mass clumps; see eq. 21.) Under this approximation, all we need to know are the mean vector and
covariance matrix. In this section we compute those quantities for the population of clumps outside some radius $R_0$ from the
centre of the galaxy. For simplicity we assume the clump population has circular symmetry, so $\kappa_s(w)$ is only a function of
$|w|$, but we consider arbitrary radial distributions.

Formally, "mean" and "covariance" refer to averages over many realizations of the clump population. Details of the
averaging process are discussed in Appendix A. The key expressions are eq. (A2) for the mean of any quantity $f$, and eq. (A4)
for the covariance between two quantities $f$ and $g$. (Note that we can compute the variance of $f$ as ${\rm cov}(f,f)$.) The
expression for the covariance is an approximation that is valid when the number of clumps is large.

Given circular symmetry, the means are trivial: $\langle\phi\rangle$, $\langle\alpha\rangle$, and $\langle\gamma\rangle$ all vanish
due to the well-known result that there is no gravitational force inside a circular shell. Strictly speaking all we can say about
the potential inside the shell is that it is constant, but that constant can be absorbed into the zeropoint. In this section we
sidestep the zeropoint by working with the differential potential relative to the origin, i.e., "$\phi$" here actually stands for
$\phi(w_{\rm img}) - \phi(0)$.

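To illustrate the covariance calculation let us consider the covariance between the deflections at two images ($\alpha_1$ at position
$w_1$, and $\alpha_2$ at position $w_2$). Using eq. (A4) we have:

$$\begin{aligned}
{\rm cov}(\alpha_1,\alpha_2)
  &= m_{\rm eff} \int \frac{1}{\pi (w_1^* - w_i^*)}\, \frac{1}{\pi (w_2 - w_i)}\, \kappa_s(w_i)\, dw_i \\
  &= \frac{m_{\rm eff}}{\pi^2} \int_{r_i > R_0} dr_i\, r_i\, \kappa_s(r_i) \int_0^{2\pi} d\theta_i\, \frac{1}{r_i^2}
     \left[ 1 + \frac{w_1^* e^{i\theta_i}}{r_i} + \frac{(w_1^*)^2 e^{2i\theta_i}}{r_i^2} + \ldots \right]
     \left[ 1 + \frac{w_2\, e^{-i\theta_i}}{r_i} + \frac{w_2^2\, e^{-2i\theta_i}}{r_i^2} + \ldots \right] \\
  &= \frac{2 m_{\rm eff}}{\pi}\, K_2 \left( 1 + K_4\, w_1^* w_2 + \ldots \right)
\end{aligned} \qquad (54)$$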
Table 2. Image positions for the sample lens used in Figure 5. The positions are in arcseconds relative to the centre of the lens galaxy,
and the images are listed in time delay order. See Keeton & Moustakas (2009) for more details.

image    x         y
M1        0.343     1.360
M2       -0.948    -0.697
S1       -1.098    -0.206
S2        0.700    -0.652



In the second line we explicitly consider only clumps outside $R_0$. We assume that $R_0$ is well outside the images
($r_i > R_0 \gg |w_1|, |w_2|$) and make a Taylor series expansion in $1/r_i$. We can then evaluate the angular integral and reduce
the expression for the covariance to integrals over the spatial distribution of substructure with different radial weighting:

$$K_2 = \int_{R_0}^{\infty} \kappa_s(r_i)\, r_i^{-2}\, r_i\, dr_i \qquad (55)$$

$$K_4 = \frac{1}{K_2} \int_{R_0}^{\infty} \kappa_s(r_i)\, r_i^{-4}\, r_i\, dr_i \qquad (56)$$

Note that $K_2$ is dimensionless, while $K_4$ has dimensions of 1/length$^2$ and scales as $K_4 \propto R_0^{-2}$. Assuming the two
images are in the vicinity of the lens galaxy's Einstein radius, $|w_1|, |w_2| \sim R_{\rm ein}$, the second term in parentheses in
eq. (54) is of order $(R_{\rm ein}/R_0)^2$, and any additional terms would be corrections of order $(R_{\rm ein}/R_0)^4$ and higher.
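The size of the neglected terms in eq. (54) can be gauged with a direct numerical check. The sketch below assumes the complex point-mass deflection $\alpha = (m/\pi)/(w^* - w_i^*)$ and an $\eta = 1$ power law for $\kappa_s$ normalised to 0.01 at the Einstein radius; it integrates the exact integrand over an annulus and compares the result with the two-term expansion. The function names, grid sizes, and the finite outer radius are choices made for this example only.

```python
import cmath
import math

def kappa_s(r, eta=1.0, kappa_ein=0.01, r_ein=1.16):
    """Assumed power-law substructure density, kappa_s ~ r^(eta - 2)."""
    return kappa_ein * (r / r_ein) ** (eta - 2.0)

def radial_moment(power, r0, r_max, n_r=400):
    """int_{r0}^{r_max} kappa_s(r) r^(-power) r dr, midpoint rule in ln r."""
    dlog = math.log(r_max / r0) / n_r
    total = 0.0
    for i in range(n_r):
        r = r0 * math.exp((i + 0.5) * dlog)
        total += kappa_s(r) * r ** (2 - power) * dlog  # r dr = r^2 dln(r)
    return total

def cov_expansion(w1, w2, m_eff, r0, r_max):
    """Two-term expansion: (2 m_eff / pi) K2 (1 + K4 w1* w2)."""
    k2 = radial_moment(2, r0, r_max)
    k4 = radial_moment(4, r0, r_max) / k2
    return 2.0 * m_eff / math.pi * k2 * (1.0 + k4 * w1.conjugate() * w2)

def cov_direct(w1, w2, m_eff, r0, r_max, n_r=400, n_theta=128):
    """Direct 2-d quadrature of the exact integrand over the annulus."""
    dlog = math.log(r_max / r0) / n_r
    dth = 2.0 * math.pi / n_theta
    total = 0j
    for i in range(n_r):
        r = r0 * math.exp((i + 0.5) * dlog)
        ring = 0j
        for j in range(n_theta):
            wi = cmath.rect(r, (j + 0.5) * dth)
            ring += 1.0 / ((w1 - wi).conjugate() * (w2 - wi))
        total += kappa_s(r) * ring * r * r * dlog * dth
    return m_eff / math.pi ** 2 * total

w_m1, w_s1 = complex(0.343, 1.360), complex(-1.098, -0.206)  # images M1 and S1
r_ein = 1.16
print(cov_direct(w_m1, w_s1, 1.0, 3 * r_ein, 100 * r_ein))
print(cov_expansion(w_m1, w_s1, 1.0, 3 * r_ein, 100 * r_ein))
```

With $R_0 = 3 R_{\rm ein}$ the two results differ at the sub-percent level, consistent with the estimate that the neglected terms are of order $(R_{\rm ein}/R_0)^4$.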

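Now if we consider the potential, deflection, and shear for two different images and assemble them into a (complex) vector
$v = (\phi_1, \alpha_1, \gamma_1, \phi_2, \alpha_2, \gamma_2)$, then we can use a similar calculation to obtain the full covariance
matrix:

$$C = \frac{m_{\rm eff}}{\pi}\, K_2 \times
\begin{pmatrix}
|w_1|^2 \left(1 + \tfrac14 K_4 |w_1|^2\right) & w_1^* \left(1 + \tfrac12 K_4 |w_1|^2\right) & \tfrac12 K_4 (w_1^*)^2 &
  \Re\left\{ w_1^* w_2 \left(1 + \tfrac14 K_4 w_1^* w_2\right) \right\} & w_1^* \left(1 + \tfrac12 K_4 w_1^* w_2\right) & \tfrac12 K_4 (w_1^*)^2 \\
 & 2 \left(1 + K_4 |w_1|^2\right) & 2 K_4 w_1^* & w_2 \left(1 + \tfrac12 K_4 w_1^* w_2\right) & 2 \left(1 + K_4 w_1^* w_2\right) & 2 K_4 w_1^* \\
 & & 2 K_4 & \tfrac12 K_4 w_2^2 & 2 K_4 w_2 & 2 K_4 \\
 & & & |w_2|^2 \left(1 + \tfrac14 K_4 |w_2|^2\right) & w_2^* \left(1 + \tfrac12 K_4 |w_2|^2\right) & \tfrac12 K_4 (w_2^*)^2 \\
 & & & & 2 \left(1 + K_4 |w_2|^2\right) & 2 K_4 w_2^* \\
 & & & & & 2 K_4
\end{pmatrix} \qquad (57)$$

Here $\Re$ denotes the real part. The lower triangle can be filled in using the fact that the covariance matrix is Hermitian. As
before, there are correction terms of order $(R_{\rm ein}/R_0)^4$ and higher.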
There are two general conclusions to draw from this analysis. First, the clump mass function enters the covariance matrix
only through the effective mass, $m_{\rm eff} = \langle m^2 \rangle / \langle m \rangle$. Second, we see explicitly how the spatial
distribution of clumps influences the long-range effects. In covariances that only involve potential and/or deflection, the
lowest-order term depends on

$$K_2 = \int_{R_0}^{\infty} \kappa_s(r_i)\, r_i^{-2}\, r_i\, dr_i \qquad (58)$$

so distant clumps are effectively weighted by $r_i^{-2}$. Of course, this weighting is compensated to some extent by the fact that
there is more area at large radius ($r\, dr$). The net effect is that $K_2$ diverges logarithmically at large radius if the
substructure population is spatially uniform, while $K_2 \propto R_0^{\eta-2}$ for the more realistic case of an asymptotic power
law $\kappa_s \propto r^{\eta-2}$. In covariances that involve a shear, the lowest-order term depends on

$$K_2 K_4 = \int_{R_0}^{\infty} \kappa_s(r_i)\, r_i^{-4}\, r_i\, dr_i \qquad (59)$$

(The factor of $K_2$ comes from the leading factor in eq. 57.) These terms clearly have less sensitivity to distant clumps.
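For a pure power law, $\kappa_s = \kappa_0 (r/r_{\rm ref})^{\eta-2}$ with $\eta < 2$, the radial integrals can be done in closed form: $K_2 = \kappa_0 (R_0/r_{\rm ref})^{\eta-2}/(2-\eta)$ and $K_4 = (2-\eta)/[(4-\eta) R_0^2]$. The sketch below checks these expressions against a simple numerical integration. The symbols $\kappa_0$ and $r_{\rm ref}$, the function names, and the large but finite outer radius are assumptions introduced for this example.

```python
import math

def k2_closed(kappa0, r_ref, r0, eta):
    """K_2 for kappa_s = kappa0 (r / r_ref)^(eta - 2); requires eta < 2."""
    return kappa0 * (r0 / r_ref) ** (eta - 2.0) / (2.0 - eta)

def k4_closed(r0, eta):
    """K_4 = (2 - eta) / ((4 - eta) R_0^2) for the same profile (eta < 2)."""
    return (2.0 - eta) / ((4.0 - eta) * r0 * r0)

def k_numeric(power, kappa0, r_ref, r0, eta, r_max_factor=1e6, n=4000):
    """int_{R_0}^{R_max} kappa_s(r) r^(-power) r dr, midpoint rule in ln r."""
    dlog = math.log(r_max_factor) / n
    total = 0.0
    for i in range(n):
        r = r0 * math.exp((i + 0.5) * dlog)
        total += kappa0 * (r / r_ref) ** (eta - 2.0) * r ** (2 - power) * dlog
    return total

kappa0, r_ref, r0, eta = 0.01, 1.16, 3 * 1.16, 1.0
k2 = k_numeric(2, kappa0, r_ref, r0, eta)
k4 = k_numeric(4, kappa0, r_ref, r0, eta) / k2
print(k2, k2_closed(kappa0, r_ref, r0, eta))
print(k4, k4_closed(r0, eta))
```

Doubling $R_0$ at fixed $\eta = 1$ halves $K_2$ and reduces $K_4$ by a factor of four, in line with the scalings $K_2 \propto R_0^{\eta-2}$ and $K_4 \propto R_0^{-2}$.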

We now use our fiducial clump model from §2.5 to construct a quantitative example. Since we are dealing with two-point
statistics we must specify the positions of the images. For illustration we use the sample "fold" lens defined by
Keeton & Moustakas (2009), which is similar to the observed lens PG 1115+080. The images (which are listed in Table 2)
correspond to a lens comprising a singular isothermal sphere with external shear, which has an Einstein radius
$R_{\rm ein} = 1.16$ arcsec. We run 10^ Monte Carlo simulations of clumps at radii $3 R_{\rm ein} < r < 100 R_{\rm ein}$, using a
power law spatial distribution with $\eta = 1$ normalised so $\kappa_s = 0.01$ at the Einstein radius. We also compute the analytic
covariance matrix from eq. (57).

Figure 5 shows covariances among the potential, deflection, and shear for a single image (upper triangle), as well as
covariances between two different images (lower triangle). For clarity we use the two real components of deflection and shear
(instead of complex variables) in the example. We have chosen images M1 and S1 for illustration, but the results are similar
for other pairs. One key result is that the direct simulations validate the analytic covariance matrix. A second result is
that there are non-trivial correlations among the lensing quantities. For a single image, the degree to which the potential is
correlated with the different components of the deflection depends on the position of the image; because of its position near
the $y$-axis, image M1 has a stronger correlation between $\phi$ and $\alpha_y$ than between $\phi$ and $\alpha_x$. For different
images, the various correlations emerge because a distant clump can affect nearby images in a similar way, although with a large
number of clumps there is always some stochasticity and the correlations are never perfect. The correlations depend on the inner
radius of the region containing clumps (the cut-off radius $R_0$ above): if we make $R_0$ large and consider only distant clumps,
the correlations will be strong because (again) distant clumps affect the images in similar ways, whereas if we make $R_0$ small and
consider relatively nearby clumps as well, the correlations will be weaker because nearby clumps affect the images differently.
It would be interesting in follow-up work to consider what physical information about the clump population is contained in
the correlations among lensing quantities. For now, we consider that this example has provided a useful demonstration of the
concepts derived from our formal analysis of long-range effects.

Figure 5. Illustration of elements of the covariance matrix for long-range effects. The upper triangle shows covariances among the
potential, deflection, and shear for a single image (M1), while the lower triangle shows covariances between two different images (M1
and S1). In each panel, the points show results from 10^ Monte Carlo simulations of clumps at $3 R_{\rm ein} < r < 100 R_{\rm ein}$,
using our fiducial example in which the mass function has a dynamic range $q = 100$ and the spatial distribution is a power law with
$\eta = 1$ and $\kappa_s = 0.01$ at the Einstein radius. The curves show the 1-, 2-, and 3-$\sigma$ contours predicted from the analytic
covariance matrix (eq. 57). The $\sigma$ values give the standard deviations in the horizontal ("x") and vertical ("y") directions,
while $\rho$ gives the Pearson product-moment correlation coefficient for the two quantities: $\rho = {\rm cov}(x,y)/(\sigma_x \sigma_y)$.
The specific values of the correlation coefficients depend on the choice of inner radius ($R_0 = 3 R_{\rm ein}$ here).

Figure 6. Cumulative probability distributions for the total potential, deflection amplitude, and shear amplitude. The different curves
correspond to different mass functions, all normalised to have the same effective mass,
$m_{\rm eff} = \langle m^2 \rangle / \langle m \rangle$ (see Table 1). The spatial distribution is a power law with $\eta = 1$ and
$\kappa_s = 0.01$ at the position of the image. We now plot the differential potential (relative to the origin), so the zeropoint is
irrelevant.



5 DISCUSSION 

In our formal analysis we have so far considered clumps near an image (§3) and clumps far away (§4). While we do not
yet have a full theory for the total effects from all clumps (including those at intermediate distances),* we can examine such
effects numerically with our Monte Carlo simulations. Our goal in this section is to see whether the inferences we have drawn
from the local and long-range analyses extend to the effects from the full clump population.

One set of inferences involves the mass function of clumps, so we first study how different mass functions affect the
probability distributions for the total potential, deflection, and shear. Figure 6 shows results from simulations with different
mass functions when they are normalised to have the same effective mass, $m_{\rm eff}$. (This is analogous to Figure 1, but now
showing total substructure effects instead of just local effects.) First consider the shear. We have found that local effects
dominate the shear, and local effects are insensitive to the clump mass function. To the extent that long-range effects contribute
to the shear, they introduce a dependence on $m_{\rm eff}$. Thus, it is no surprise to see in Figure 6 that the total shear
distributions are basically indistinguishable for the different mass functions. For comparison, previous analytic results for a
uniform spatial distribution of point masses (i.e., microlensing) found that the shear distribution is strictly insensitive to the
mass function, at least when the number of clumps is sufficiently large (Schneider 1987b; Petters et al. 2008). Rozo et al. (2006)
argued that the shear and magnification distributions can depend on the clump mass function when the clumps are spatially extended,
although simulations of substructure lensing indicate that any dependence on the mass function is not strong (when the source
is small; e.g., Dalal & Kochanek 2002; Shin & Evans 2008).



Next consider the deflection. We have found that local and long-range effects are comparable in importance (also see
Chen et al. 2007). The deflection distribution depends on the mass function only through $m_{\rm eff}$ in the case of a uniform
spatial distribution; see eqs. (38) and (57), and also Katz et al. (1986) and Neindorf (2003). Other mass moments enter when the
spatial distribution is not uniform (see eq. 53), but Figure 6 shows that the deflection distribution is still only mildly sensitive
to the mass function. Finally consider the potential. We have found that local effects in the potential are negligible, and
long-range effects depend on the mass function through $m_{\rm eff}$. This explains why the potential distribution shows only small
variations with the mass function. Now, it is an oversimplification to say that $m_{\rm eff}$ is the only important property of the
mass function, especially when discussing the deflection and potential; there are some residual variations in the probability
distributions, which presumably arise from intermediate-scale clumps not yet treated in our formal theory, that will need to
be incorporated into any detailed, quantitative study of substructure lensing. Nevertheless, as a conceptual rule of thumb it
seems fair to say that the deflection and potential are mainly sensitive to the effective clump mass.

The other set of inferences involves the spatial distribution of clumps, so Figure 7 shows probability distributions for the
total potential, deflection, and shear from simulations with different power law radial profiles. We have already seen that local
effects are not very sensitive to the power law index $\eta$ (see Figure 3). The total shear is still only modestly sensitive to
$\eta$, and only at the low end of the shear distribution; the reason is that large shears are dominated by clumps in the vicinity
of an image. The total deflection is more sensitive to $\eta$, because it contains a significant contribution from non-local clumps.
And the total potential is very sensitive to the spatial distribution of clumps, since it is dominated by non-local effects.

* Recall that the net effects from all clumps have been studied for the case of a uniform spatial distribution, most recently in the work
by Petters et al. (2009, 2008).

Figure 7. Similar to Figure 6, but now for different power law indices for the spatial distribution of clumps:
$\kappa_s \propto r^{\eta-2}$. Results are shown for a mass function with dynamic range $q = 100$.



6 CONCLUSIONS 



We have developed certain aspects of the theory of gravitational lensing with stochastic substructure in order to understand
how information about the population of mass clumps is encoded in various lensing observables. Specifically, we have derived
probability distributions for the potential, deflection, and shear produced by the clumps in the vicinity of a lensed image;
and we have computed the covariance matrix for a multivariate Gaussian distribution representing the potential, deflection,
and shear due to clumps far from the lensed images. This analysis extends previous work on stochastic microlensing
(Nityananda & Ostriker 1984; Katz et al. 1986; Deguchi & Watson 1987, 1988; Schneider 1987b; Seitz & Schneider 1994;
Seitz et al. 1994; Neindorf 2003; Tuntsov & Lewis 2006a,b; Petters et al. 2009, 2008) by allowing both general spatial
distributions and mass functions for the clump population.

We have drawn two main conclusions about the clump mass function. First, the probability distribution for the local shear
is strictly independent of the clump mass function for a uniform spatial distribution, and essentially insensitive to the mass
function for more general spatial distributions. Since the total shear is dominated by clumps near the image, the probability
distribution for the total shear is effectively insensitive to the mass function as well. Second, the probability distributions
for the potential and deflection depend on the mass function mainly through the effective mass,
$m_{\rm eff} = \langle m^2 \rangle / \langle m \rangle$. There are some higher-order effects that depend on other moments of the
mass function, but the principal scaling for both the potential and deflection is with $m_{\rm eff}$. Our conclusions about the
shear and deflection generalise previous results on stochastic microlensing with an infinite, uniform distribution of stars
(Katz et al. 1986; Schneider 1987b; Neindorf 2003; Petters et al. 2009, 2008).



The spatial distribution of clumps has little effect on the probability distributions for the local shear and potential, and
only a modest effect on the probability distribution for the local deflection. (It is no surprise, of course, that "local" effects
are not very sensitive to the global distribution.) Even the total shear distribution has only a modest sensitivity to the spatial
distribution of clumps. The total deflection distribution, by contrast, changes more significantly when we vary the spatial
distribution; and the total potential distribution is very sensitive to changes in the global population of clumps.

The high-level conclusion from this work is that different lensing observables depend on the clump population in different
ways, as summarised in Table 3. If we can measure not only magnifications (in practice, magnification ratios) but also image
positions and time delays well enough to detect substructure effects, we will gain the ability to probe substructure in lens
galaxies in new ways and extract additional information about the clump populations.

There are, to be sure, some limits to the analysis presented here. First, as in previous work on stochastic microlensing, we
have explicitly assumed point mass clumps and neglected any correlations between clumps. While the point mass approximation
is fine for clumps projected far from the lensed images, it will break down for clumps that overlap the line of sight; this will
have the greatest effect on the substructure shear, since that is mainly associated with clumps near an image (see Rozo et al.
2006). It will be interesting in future work to seek a general theory of stochastic lensing that can handle "puffy" clumps (as in
the substructure simulations by Metcalf & Madau 2001; Dalal & Kochanek 2002; Macciò & Miranda 2006; Chen et al. 2007;
Shin & Evans 2008) and allow correlations between clumps. Second, this paper has implicitly focused on lensing of a point-like
source. There are ways to treat an extended source in the context of microlensing (e.g., Deguchi & Watson 1987, 1988;
Seitz & Schneider 1994; Seitz et al. 1994; Neindorf 2003; Tuntsov & Lewis 2006a,b), although the issues may be somewhat
different in substructure lensing since the optical depth is generally lower and there can be complicated, and interesting,
"resonances" that appear when the source size is comparable to the Einstein radius of a clump (see Dobler & Keeton 2006).
Once finite source effects are incorporated, this formalism could be applied to other problems in stochastic lensing such as
planetesimal disk microlensing (Heng & Keeton 2009).

Table 3. Heuristic guide to complementarity in substructure lensing. In Column 2 we use the scaled radius $\hat r = r/R_{\rm ein}$, where
$r$ is the distance of a clump from the image and $R_{\rm ein}$ is the Einstein radius of the clump (not of the macromodel).

observable       mnemonic                                       mass scale                                  spatial scale
magnifications   $\delta\gamma \sim 1/\hat r^2$                 $\int m\, (dN/dm)\, dm$                     quasi-local
positions        $\delta x \sim R_{\rm ein}/\hat r$             $\langle m^2 \rangle / \langle m \rangle$   intermediate
time delays      $\delta\phi \sim R_{\rm ein}^2 \ln \hat r$     $\langle m^2 \rangle / \langle m \rangle$   long-range

There are several attractive opportunities to extend this work. We have already noted that there can be correlations among
the various lensing quantities (see §4), and it would be interesting to see whether such two-point statistics contain additional
information about the clump population. Such an analysis could even be extended to higher-order statistics, although it is not
clear when the data might be available to examine those. Another goal would be to develop a more complete description of
how the spatial distribution of clumps affects the various lensing quantities. We can think of this in terms of a spatial kernel
that gives different weights to clumps at different distances from an image. We have identified the behavior of this kernel
in the "near" and "far" regimes, but it would be nice to understand the full spatial kernel or at least identify the spatial
moments that are most important for the total potential, deflection, and shear.



APPENDIX A: AVERAGING OVER THE CLUMP POPULATION 



In this Appendix we specify how to compute averages over many realizations of the clump population. Consider any quantity 
/ that is a sum of contributions from individual clumps, / = fi, where i is the clump index. The general average over 
clump realizations has the form: 

(/) = J f ^^Y].Pu,{wj) pm(mj) dwj dm^l = ^ j fi Pw{wi) Pm{mi) dwi drui = j fi Ks{wi) pm,(mt) dim dm^ (Al) 

The first step is the formal definition of the average over clump populations. In the second step we use the fact that $p_w(w_j)$ 
and $p_m(m_j)$ are normalised so they integrate to unity for $j \neq i$. In the third step we recognise that all terms in the sum are the 
same, so we can replace the sum with multiplication by $N$; and we use eq. (26) to rewrite $p_w(w) = \kappa_s(w)/(N \langle m \rangle)$. Finally, 
since all the lensing quantities are proportional to the clump mass (cf. eqs. 18-20), we can define $\hat{f}_i = f_i/m_i$ to be a quantity 
that is independent of mass. The factor of $m_i$ goes directly into the mass integration, which becomes $\int m_i\, p_m(m_i)\, dm_i = \langle m \rangle$, 
so we obtain 

$$
\langle f \rangle = \int \hat{f}_i\, \kappa_s(w_i)\, dw_i \tag{A2}
$$

Note that in this expression the i is arbitrary; it simply indicates that when evaluating the remaining integral we consider 
only an individual clump. 
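
As an illustrative check of eq. (A2), the following minimal sketch compares the analytic average with a brute-force average over random clump realizations. Everything specific here is an assumption made for the example and is not taken from the paper: the kernel `f_hat` is a toy Gaussian rather than a real lensing quantity, the clumps are uniformly distributed in a disk (so $\kappa_s$ is constant), the mass function is uniform on an interval, and masses are in units where $\kappa_s = N \langle m \rangle\, p_w$.

```python
import numpy as np


def f_hat(w):
    """Hypothetical mass-independent kernel; a stand-in, not a lensing quantity."""
    return np.exp(-np.abs(w) ** 2)


def mean_from_A2(n_clumps, radius, mean_mass):
    """Evaluate eq. (A2) analytically for a uniform disk of clumps."""
    # Uniform positions: p_w = 1 / (pi R^2), so kappa_s = N <m> p_w is constant.
    kappa_s = n_clumps * mean_mass / (np.pi * radius**2)
    # The integral of exp(-|w|^2) over the disk is pi * (1 - exp(-R^2)).
    return kappa_s * np.pi * (1.0 - np.exp(-radius**2))


def mean_monte_carlo(n_real, n_clumps, radius, m_lo, m_hi, seed=0):
    """Average f = sum_i m_i f_hat(w_i) over n_real random clump realizations."""
    rng = np.random.default_rng(seed)
    shape = (n_real, n_clumps)
    # Uniform sampling in a disk: |w| = R sqrt(u), with a uniform polar angle.
    r = radius * np.sqrt(rng.random(shape))
    phi = 2.0 * np.pi * rng.random(shape)
    w = r * np.exp(1j * phi)
    # Assumed toy mass function: uniform on [m_lo, m_hi].
    m = rng.uniform(m_lo, m_hi, shape)
    f = np.sum(m * f_hat(w), axis=1)
    return f.mean()
```

For example, `mean_monte_carlo(4000, 50, 2.0, 0.5, 1.5)` and `mean_from_A2(50, 2.0, 1.0)` should agree to within the sampling noise of the Monte Carlo estimate. Only $\langle m \rangle$ enters the analytic result, as eq. (A2) implies.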

Now we consider the covariance between two quantities $f = \sum_i f_i$ and $g = \sum_j g_j$. We can write the covariance as follows 
(note the complex conjugation, which makes the covariance matrix Hermitian): 



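$$
\begin{aligned}
\mathrm{cov}(f,g) &= \langle f g^* \rangle - \langle f \rangle \langle g^* \rangle \\
&= \sum_i \int f_i\, g_i^*\, p_w(w_i)\, p_m(m_i)\, dw_i\, dm_i + \sum_{i \neq j} \int f_i\, g_j^*\, p_w(w_i)\, p_w(w_j)\, p_m(m_i)\, p_m(m_j)\, dw_i\, dm_i\, dw_j\, dm_j \\
&\qquad - \sum_i \sum_j \left[ \int f_i\, p_w(w_i)\, p_m(m_i)\, dw_i\, dm_i \right] \left[ \int g_j^*\, p_w(w_j)\, p_m(m_j)\, dw_j\, dm_j \right] \\
&= N \int f_i\, g_i^*\, p_w(w_i)\, p_m(m_i)\, dw_i\, dm_i - N \left[ \int f_i\, p_w(w_i)\, p_m(m_i)\, dw_i\, dm_i \right] \left[ \int g_j^*\, p_w(w_j)\, p_m(m_j)\, dw_j\, dm_j \right] \\
&= \frac{1}{\langle m \rangle} \int f_i\, g_i^*\, \kappa_s(w_i)\, p_m(m_i)\, dw_i\, dm_i - \frac{1}{N \langle m \rangle^2} \int f_i\, g_j^*\, \kappa_s(w_i)\, \kappa_s(w_j)\, p_m(m_i)\, p_m(m_j)\, dw_i\, dm_i\, dw_j\, dm_j
\end{aligned}
\tag{A3}
$$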
In the second step we write out the averages, using the fact that $p_w(w_k)$ and $p_m(m_k)$ integrate to unity for all $k \neq i, j$ and 
grouping terms in the double sum according to whether $i = j$ or $i \neq j$. We then recognise that the second and third terms 
have identical forms, so we just need to determine how many total elements there are in these sums. There are $N(N-1)$ 
elements in the second term (sum over $i \neq j$), and $N^2$ elements in the third term (full sums over both $i$ and $j$), yielding a 
total of $N$ elements with an overall minus sign, since $N(N-1) - N^2 = -N$. Also, there are $N$ elements of the sum over $i = j$ in the first term. This allows 
us to obtain the simplified expression in the third line. Finally, we replace $p_w(w) = \kappa_s(w)/(N \langle m \rangle)$ to obtain the expression in 
the fourth line. In this expression, note that the first term is independent of the number of clumps $N$, while the second term 
is $\mathcal{O}(1/N)$. When the number of clumps is large, the second term is negligible and to good approximation we have 

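$$
\mathrm{cov}(f,g) \approx \frac{1}{\langle m \rangle} \int f_i\, g_i^*\, \kappa_s(w_i)\, p_m(m_i)\, dw_i\, dm_i = \frac{\langle m^2 \rangle}{\langle m \rangle} \int \hat{f}_i\, \hat{g}_i^*\, \kappa_s(w_i)\, dw_i \tag{A4}
$$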

where (as before) we define $\hat{f}_i = f_i/m_i$ and $\hat{g}_i = g_i/m_i$ as quantities that are independent of mass. 
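
The size of the $\mathcal{O}(1/N)$ term dropped in eq. (A4) can be illustrated with a small numerical sketch. The setup is an assumption chosen so that every integral is analytic, and none of it comes from the paper: two toy Gaussian kernels $\hat{f} = e^{-a_f |w|^2}$ and $\hat{g} = e^{-a_g |w|^2}$, clumps uniform in a disk with a fixed number density (so adding clumps enlarges the disk at fixed $\kappa_s$), and masses uniform on $[0.5, 1.5]$. The kernels are real, so the complex conjugation is trivial here and is kept only to mirror eq. (A3).

```python
import numpy as np

MEAN_M = 1.0                 # <m> for masses uniform on [0.5, 1.5]
MEAN_M2 = 1.0 + 1.0 / 12.0   # <m^2> = <m>^2 + variance of that distribution


def disk_gauss_integral(a, radius):
    """Integral of exp(-a |w|^2) over a disk of the given radius."""
    return np.pi / a * (1.0 - np.exp(-a * radius**2))


def cov_predictions(n_clumps, number_density, a_f=1.0, a_g=0.5):
    """Return (eq. A4 approximation, full last line of eq. A3)."""
    radius = np.sqrt(n_clumps / (np.pi * number_density))
    kappa_s = number_density * MEAN_M  # constant surface density
    leading = MEAN_M2 / MEAN_M * kappa_s * disk_gauss_integral(a_f + a_g, radius)
    # Second term of eq. (A3): product of the two means, divided by N.
    correction = (kappa_s * disk_gauss_integral(a_f, radius)
                  * kappa_s * disk_gauss_integral(a_g, radius)) / n_clumps
    return leading, leading - correction


def cov_monte_carlo(n_real, n_clumps, number_density, a_f=1.0, a_g=0.5, seed=1):
    """Sample covariance of f and g over n_real random clump realizations."""
    rng = np.random.default_rng(seed)
    radius = np.sqrt(n_clumps / (np.pi * number_density))
    shape = (n_real, n_clumps)
    # For uniform positions in a disk, |w|^2 is uniform on [0, R^2].
    r2 = radius**2 * rng.random(shape)
    m = rng.uniform(0.5, 1.5, shape)
    f = np.sum(m * np.exp(-a_f * r2), axis=1)
    g = np.sum(m * np.exp(-a_g * r2), axis=1)
    return np.mean((f - f.mean()) * np.conj(g - g.mean()))
```

Calling `cov_predictions(50, 0.5)` and `cov_predictions(500, 0.5)` shows the gap between the two predictions shrinking roughly in proportion to $1/N$, and `cov_monte_carlo(40000, 50, 0.5)` can be compared against the full expression. In this toy setup the leading term depends on the mass function only through $\langle m^2 \rangle / \langle m \rangle$.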